Primary Biodiversity Data

Observations of where a species occurs are the fundamental unit of biodiversity data. In this unit we will explore where to find open-access occurrence data, how to access those sources from R, and tools for visualizing point distributions of species.

Library ‘spocc’

spocc is a tool from the rOpenSci consortium (a group of developers building R capacity for open science). It provides a single interface for querying several occurrence databases, including GBIF.

Package details on GitHub

Tutorial here

We should all have spocc installed, but if not, try:

install.packages('spocc')

With spocc installed we can try a simple query of the GBIF database that we have seen briefly before.

library(spocc)
ulmus <- occ(query='Ulmus americana', from='gbif')

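The data are returned as an “S3 class” object, which is essentially a list with one element per data source. Nested inside that list is a tidyverse tibble (a data frame variant with its own printing and subsetting behavior) that holds the actual records.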

print(ulmus) ## Not obvious what or where the data are
View(ulmus)

It may still not be obvious how to get in. To view an element of the returned object we use the “$” operator and refer to each element by name. In general it is easier to convert the records to a regular R data frame, since not everything we want to do with these data is compatible with the tidyverse/spocc formatting.

df <- as.data.frame(occ2df(ulmus$gbif))

# Also try:
# head(df)
# colnames(df) #!! That's a lot of columns!!
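
The “$” navigation described above can be practiced without querying GBIF. The following is a minimal sketch using a hand-built list that only mimics the nested shape of an occurrence result (source, then data, then one table per query). The object `toy`, the class label `toy_occdat`, and every value in it are invented for illustration, and the real object returned by occ() may carry additional elements.

```r
# Toy stand-in for an occurrence result; all values are invented.
toy <- list(
  gbif = list(
    meta = list(source = "gbif", found = 3L),
    data = list(
      Ulmus_americana = data.frame(
        name      = rep("Ulmus americana", 3),
        longitude = c(-93.2, -77.0, NA),
        latitude  = c(44.9, 38.9, 41.5),
        year      = c(2015L, 2019L, 2021L),
        stringsAsFactors = FALSE
      )
    )
  )
)
class(toy) <- "toy_occdat"  # an S3 class is a label attached to an ordinary list

names(toy)        # which sources are present
names(toy$gbif)   # the pieces stored for one source

# Drill down by name, one "$" per level, until we reach the records
recs <- toy$gbif$data$Ulmus_americana
head(recs)
```

The same habit (call names() at each level, then step in with “$”) is a reasonable way to explore the real `ulmus` object.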

mapr: Leaflet mapping of species distribution data.

To create interactive graphics showing species occurrence locations and some metadata we can use ‘mapr’. This library uses the Leaflet JavaScript library and OpenStreetMap tile services (and others!) to create interactive maps. You can pan and zoom the map and click on a point to pop up metadata about that occurrence.

If not already done:

install.packages('mapr')

Then:

library(mapr)
map_leaflet(df)

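‘mapr’ shows the data for the first few columns in each pop-up. We can control what is shown there by passing only some columns to map_leaflet(). Keep the name, longitude, and latitude columns in the selection so the points can still be placed and labelled.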

map_leaflet(df[,c('name', 'longitude', 'latitude', 'stateProvince', 'year', 'occurrenceID')])
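
Before mapping, it can help to check that the chosen columns exist and to drop records without coordinates. Below is a minimal base R sketch on a toy data frame. The names `toy_df`, `keep_cols`, and `map_df` and all values are invented for illustration, and the map_leaflet() call is left commented out because it needs mapr loaded.

```r
# Toy occurrence table using the same column names as above; values are invented.
toy_df <- data.frame(
  name          = rep("Ulmus americana", 4),
  longitude     = c(-93.2, -77.0, NA, -84.4),
  latitude      = c(44.9, 38.9, 41.5, NA),
  stateProvince = c("Minnesota", "Maryland", "Ohio", "Georgia"),
  year          = c(2015L, 2019L, 2021L, 1998L),
  stringsAsFactors = FALSE
)

keep_cols <- c("name", "longitude", "latitude", "stateProvince", "year")

# Stop early if any requested column is absent from the table
missing_cols <- setdiff(keep_cols, colnames(toy_df))
stopifnot(length(missing_cols) == 0)

# Keep only rows where both coordinates are present, then only the chosen columns
has_xy <- complete.cases(toy_df[, c("longitude", "latitude")])
map_df <- toy_df[has_xy, keep_cols]

print(map_df)
# map_leaflet(map_df)  # uncomment once library(mapr) has been loaded
```

The same steps should apply to the real `df` by substituting it for `toy_df`.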